Xia and Tong have made a novel contribution to
the debate on whether and how to carry out some 
sort of feature matching in preference to a statisti- 
cally efficient alternative such as the maximum like- 
lihood estimate (MLE). They show that an estima- 
tion criterion emphasizing long-term predictions has 
some advantages over the MLE on some misspecified 
time series models. However, emphasizing long-term 
predictions must lead to a down-weighting of higher-
frequency information in the data. In particular,
Xia and Tong's catch-all approach does not typi- 
cally share the statistical efficiency of MLE when 
the model fits the data adequately. Further, it is nec- 
essarily the case (whatever fitting method is used) 
that some scientific inferences one might wish to 
conclude from fitting a misspecified model are statis- 
tically invalid. Scientific interpretation of fitted pa- 
rameter values and predictions using a model that is 
a statistically poor match to the data therefore re- 
quires considerable care. One seeks models that are 
simultaneously scientifically relevant and provide an 
adequate statistical description of the data, and then 
statistical efficiency becomes an important consid- 
eration for drawing scientific conclusions from lim- 
ited data. Flexible modern inference methods facili- 
tate the development and statistical analysis of such 
models. I will discuss these issues in the context of 
Xia and Tong's analysis of Nicholson's blowfly data. 
Similar considerations arise in their measles example,
and have been investigated by He, Ionides and
King (2010).

Xia and Tong's APE(<1) estimate is equivalent 
to the MLE only for a specific choice of stochas- 
tic model. From their equation (3.12), we see that 
APE(<1) corresponds to the MLE for additive,
Gaussian, constant-variance process noise with no 
measurement error. For Xia and Tong's blowfly model,
the log-likelihood at the APE(<1) point estimate
is -1568.5 whereas the log-likelihood at the
APE(<T) point estimate is -1569.5. A chi-squared
approximation indicates that a full likelihood-based 
analysis for this model should consider the APE(<1) 
and APE(<T) point estimates to be both statisti- 
cally plausible, since the difference of 1.0 log units 
is not large compared to typical values of 1/2 of 
a chi-squared random variable with five degrees of 
freedom. To check the extent to which either of these 
point estimates provides a reasonable statistical ex- 
planation of the data, I compared their goodness of 
fit with that of a simple phenomenological model. 
For oscillating populations, a log-ARMA model is 
an appropriate choice (He, Ionides and King, 2010).
I fitted a stationary log-ARMA model to the 9th 
through 200th data points for which predictions are 
made by Xia and Tong's model, in order to ensure 
that the resulting likelihood provides a fair compar- 
ison. A log-ARMA(2, 2) model gives a maximized 
log-likelihood of -1542.3 based on estimating six parameters.
Xia and Tong's mechanistic model therefore
explains the data considerably more poorly (e.g.,
judged by Akaike's information criterion) than this 
simple black-box model. Is it possible to preserve the 
scientific interpretability of Xia and Tong's model 
while also providing a statistically satisfactory ex- 
planation of the data? To address this question, I fit- 
ted a dynamic model adapted from Wood (2010) 
which has a similar structure to the model of Xia 
and Tong but differs by formulating the stochastic- 
ity in a scientifically motivated way. This alternative 
model is described in full in the Appendix below. 
I evaluated the likelihood by sequential Monte Carlo 
and computed the MLE by iterated filtering (Ionides,
Breto and King, 2006) implemented using the
pomp package for R (King et al., 2010). Maximization
over the six parameters led to a log-likelihood of
-1465.4. Figure 1 shows that the skeleton of this alternative
model matches the periodicity in the data,
a measure of fit which Xia and Tong chose to emphasize
in their Figure 8. The likelihood at the MLE
also comfortably outperforms the log-ARMA(2, 2) 
benchmark so subsequent analysis can consider the 
model to be adequately specified, at least to a first 
approximation. Of course, further advances in the
model specification cannot be ruled out. Indeed, a careful and complete
investigation would be expected to reveal some as- 
pects of this improved model that are a statistically 
significant mismatch to the data. 
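
The comparisons in this paragraph amount to a few lines of arithmetic, sketched below in Python. The log-likelihoods are those quoted above. The parameter count of five for Xia and Tong's model is an assumption inferred from the five degrees of freedom mentioned in the text, and the names `fits` and `aic` are illustrative only. Note that the APE rows use the log-likelihood at a point estimate rather than at a maximum, so their AIC values are indicative at best.

```python
# Log-likelihoods as quoted in the text; parameter counts for the
# APE fits are an assumption (five, matching the chi-squared df).
fits = {
    "APE(<1) point estimate": (-1568.5, 5),
    "APE(<T) point estimate": (-1569.5, 5),
    "log-ARMA(2,2)": (-1542.3, 6),
    "alternative model (MLE)": (-1465.4, 6),
}


def aic(loglik, n_params):
    """Akaike's information criterion: smaller is better."""
    return 2 * n_params - 2 * loglik


aic_table = {name: aic(ll, k) for name, (ll, k) in fits.items()}

# Gap between the two APE point estimates, in log units.
loglik_gap = fits["APE(<1) point estimate"][0] - fits["APE(<T) point estimate"][0]

# Half of a chi-squared variable with 5 df has mean 5/2, which gives a
# rough scale for "typical values" against which to judge the gap.
half_chisq_mean = 5 / 2

for name, value in sorted(aic_table.items(), key=lambda kv: kv[1]):
    print(f"{name:28s} AIC = {value:.1f}")
print(f"log-likelihood gap = {loglik_gap:.1f}, typical scale = {half_chisq_mean:.1f}")
```

On these numbers the gap of 1.0 log units is well below the typical scale of 2.5, while the AIC ordering places the alternative model first, the log-ARMA(2, 2) benchmark second and both APE fits last.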

The analysis above may suggest that choice of mo- 
del is more important than the specific choice of in- 
ference methodology. However, model development 
is facilitated by statistical methodology that is ap- 
propriate for general classes of models (so that the 
scientist is not constrained by the methodology when 
developing models) and which is convenient for quan- 
titative comparisons between models. Xia and Tong's 
APE(<1) and APE(<T) criteria are not appropri- 
ate for nonstationary, partially observed dynamic 
systems evolving in continuous time. These features 
are typical of ecological and epidemiological systems 
(Bjørnstad and Grenfell, 2001). Likelihood is quite
generally applicable in theory, though feature-ma- 
tching methodology has previously been advocated 
to avoid the practical numerical issues of working 
with the likelihood for dynamic models (Wood, 2010, 
and references therein). Recently, calculation and 
maximization of the likelihood function for general 
nonlinear, partially observed dynamic models has 
become computationally routine in many ecological 
dynamic systems (e.g., King et al., 2008; He, Ionides 
and King, 2010; Laneri et al., 2010). 

A criterion such as APE(<T) may help to empha- 
size certain low-frequency (long time scale) features 
of the data such as the periodicities in the blowfly
population. While this may be of scientific interest
as a component of a data analysis, it is not desirable
as a complete analysis due to the obverse property
of suppressing high-frequency (short time scale)
features. The efficiency of the MLE corresponds to 
an optimal balance between frequencies, in the spe- 
cific sense of minimizing asymptotic variance of pa- 
rameter estimates when the model is correct. This 
balance between frequencies is perhaps most clearly 
seen in the context of Whittle's approximation to 
the likelihood, discussed by Xia and Tong in Sec- 
tion 2.2. Although the usual decomposition of the 
likelihood for dynamic models appears to emphasize 
one-step prediction, the combination of all one-step 
predictions corresponds to an estimator which effi- 
ciently combines the contributions of all frequencies. 
I shall argue that high-frequency features may be
even more scientifically important than
low-frequency features.
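
To make the balance between frequencies concrete, the following Python sketch evaluates Whittle's approximate log-likelihood, up to an additive constant, as a sum with one term per Fourier frequency. The AR(1) spectral density, the simulated series and all function names are assumptions chosen for illustration. They are not the models analyzed by Xia and Tong or in this discussion.

```python
import cmath
import math
import random


def periodogram(x):
    """Return (frequency, periodogram ordinate) pairs at the Fourier
    frequencies 2*pi*j/n for j = 1, ..., floor((n - 1) / 2)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    out = []
    for j in range(1, (n - 1) // 2 + 1):
        w = 2 * math.pi * j / n
        dft = sum(v * cmath.exp(-1j * w * t) for t, v in enumerate(centered))
        out.append((w, abs(dft) ** 2 / (2 * math.pi * n)))
    return out


def ar1_spectral_density(w, phi, sigma2):
    """Spectral density of x_t = phi * x_{t-1} + noise with variance sigma2."""
    return sigma2 / (2 * math.pi * abs(1 - phi * cmath.exp(-1j * w)) ** 2)


def whittle_loglik(x, phi, sigma2):
    """Whittle approximation: every frequency contributes one term."""
    total = 0.0
    for w, ordinate in periodogram(x):
        f = ar1_spectral_density(w, phi, sigma2)
        total -= math.log(f) + ordinate / f
    return total


# Illustrative data: an AR(1) series simulated with phi = 0.7.
rng = random.Random(1)
x = [0.0]
for _ in range(199):
    x.append(0.7 * x[-1] + rng.gauss(0.0, 1.0))

print(whittle_loglik(x, 0.7, 1.0), whittle_loglik(x, -0.5, 1.0))
```

A criterion that kept only the low-frequency terms of this sum would discard the information carried by the remaining terms, which is the loss of efficiency discussed above.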

Both the blowfly and measles examples involve 
analyzing mechanistic models that aim to explain 
the long-term dynamics of the system in terms of 
models constructed to describe the short-term in- 
crements or temporal derivatives (Brillinger, 2008; 
Breto et al., 2009). The APE(<T) estimate necessarily
has a poorer fit than the one-step APE(<1)
estimate, in a least squares sense, to the short-term 
behavior that provides the scientific rationale for the 
mechanistic model. Xia and Tong's blowfly example 
suggests that this property can lead to a greater sci- 
entific interpretability of the APE(<1) parameter 
estimates. I consider each parameter in turn: 

1. In the biological interpretation of Xia and Tong's 
model, c corresponds to the number of eggs laid 
per adult blowfly per bi-day that develop into 
adults in the absence of competition for food. 
From data on eggs collected in this blowfly experiment
(Brillinger et al., 1980) we see that this biological
quantity peaked at $c \approx 20$ in the troughs
of the population cycles. This matches closely
the estimate $\hat{c}_1 = 20.1$ via the APE(<1) criterion.
The APE(<T) estimate, $\hat{c}_T = 592$, is an order of
magnitude higher than this biological interpretation
permits.

2. The original biological motivation for Xia and
Tong's model had $\alpha = 1$ (Gurney, Blythe and
Nisbet, 1980) and a value slightly less than 1
has been proposed when making a discrete-time
approximation to a continuous dynamic system.
The APE(<1) estimate $\hat{\alpha}_1 = 0.846$ is consistent
with this interpretation, whereas the APE(<T)
estimate $\hat{\alpha}_T = 0.263$ is so far below unity that it
requires a reinterpretation of the biological story
behind the model.

3. Biologically, $\alpha N_0$ is the adult population size that
maximizes the total number of successfully
developing eggs laid. Empirically, the adult population
size maximizing total egg production occurred
during troughs of adult abundance at successive
values of 397, 542, 167, 2236, 2267, 539,
1308, 2363, 3806 and 254 adults for the ten cycles
analyzed. The APE(<1) estimate $\hat{\alpha}_1 \hat{N}_{0,1} = 499$
and the APE(<T) estimate $\hat{\alpha}_T \hat{N}_{0,T} = 344$ are
both broadly consistent with this interpretation.

4. $2/(1 - \nu)$ may be biologically interpreted as the
life expectancy of the blowfly adults. The estimates
$2/(1 - \hat{\nu}_1) = 8.33$ and $2/(1 - \hat{\nu}_T) = 5.67$ are
both broadly biologically plausible. Empirically,
life expectancy decreased substantially when the 
adult population was large (Brillinger et al., 1980; 
Guttorp, 1981), and so one must permit some 
flexibility in the interpretation of the constant 
life expectancy assumed by this model. 
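
The interpretations in items 3 and 4 can be checked with a short derivation. It assumes, as a reading of Xia and Tong's model that is not spelled out in this discussion, that recruitment per bi-day has the form $c\,x^{\alpha}\exp(-x/N_0)$ for adult population $x$, and that each adult survives a two-day step independently with probability $\nu$. The lifetime variable $L$ is introduced here for illustration.

```latex
% Item 3: the recruitment curve is maximized at x = alpha * N_0.
\frac{d}{dx}\Big[c\,x^{\alpha} e^{-x/N_0}\Big]
  = c\,x^{\alpha-1} e^{-x/N_0}\Big(\alpha - \frac{x}{N_0}\Big) = 0
  \quad\Longrightarrow\quad x = \alpha N_0 .

% Item 4: L = number of two-day steps lived, geometric with success
% probability 1 - nu, so the life expectancy is 2/(1 - nu) days.
P(L = k) = \nu^{k-1}(1-\nu), \quad k = 1, 2, \ldots
  \quad\Longrightarrow\quad
  E[L] = \frac{1}{1-\nu}, \qquad 2\,E[L] = \frac{2}{1-\nu} .

% Inverting the quoted life expectancies gives the implied survival
% probabilities.
\hat{\nu}_1 = 1 - \frac{2}{8.33} \approx 0.760, \qquad
\hat{\nu}_T = 1 - \frac{2}{5.67} \approx 0.647 .
```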

In conclusion, Xia and Tong's APE(<1) and 
APE(<T) fits to the blowfly data are statistically 
more-or-less equally valid. Both are handicapped by 
the substantial misspecification of the fitted model. 
The APE(<T) estimate fits the periodicity of the 
fluctuations better but at the expense of the biolog- 
ical interpretation of the fitted parameters. Supe- 
rior models can simultaneously satisfy each of these 
considerations. If the model is adequately specified, 
likelihood-based analysis provides a powerful set of 
tools for investigating the range of statistically plau- 
sible parameter values. If the model is poorly spec- 
ified, likelihood provides a powerful framework for 
diagnosing the misspecification and a flexible frame- 
work for constructing improved models. 



APPENDIX: AN ALTERNATIVE BLOWFLY 
DATA ANALYSIS 

Let $N(t)$ be the number of adult blowflies at time $t$.
Suppose that the number of newly emerging adults
during the time interval $[t, t+\Delta]$ is $R_t$, and the number
of adults surviving from time $t$ to $t+\Delta$ is $S_t$, so
that $N(t+\Delta) = R_t + S_t$. Suppose that $R_t$ and $S_t$ are
conditionally independent given $N(t)$ and $N(t-\tau)$
with conditional distributions

$$R_t \sim \mathrm{Poisson}\big[N(t-\tau)\, P \exp\{-N(t-\tau)/N_0\}\, \Delta\, e_t\big],$$

$$S_t \sim \mathrm{Binomial}\big[N(t),\, \exp\{-\delta \Delta \varepsilon_t\}\big].$$

Here, $e_t$ and $\varepsilon_t$ are independent Gamma-distributed
random effects with mean 1 having respective variances
$\sigma_p^2 \Delta^{-1}$ and $\sigma_d^2 \Delta^{-1}$. When $\Delta = 2$ day this model
is similar to the model of Xia and Tong, with
parameters $N_0$ and $\tau$ having matching interpretations
and the remaining parameters translating to
$\alpha = 1$, $c \approx 2P$ and $\nu \approx \exp(-2\delta)$. When $\Delta = 1$ day
this corresponds exactly to the dynamic model of 
Wood (2010). Wood (2010) employed a generalized 
method of simulated moments to estimate param- 
eters, but I shall instead construct a partially ob- 
served Markov process (POMP) model for which 
likelihood-based methods are available. 
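
The process model above is straightforward to simulate, which is all that the inference method used here requires of it. The following Python sketch uses only the standard library. It is not the pomp implementation used for the analysis. The default parameter values are the estimates reported at the end of this Appendix for a one-day step, and the constant initial history of 500 adults, the function names and the simple samplers are assumptions made for illustration.

```python
import math
import random


def rpois(lam, rng):
    """Poisson draw: count unit-rate exponential arrivals in [0, lam]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)
    return count


def rbinom(n, p, rng):
    """Binomial draw as a sum of n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def rgamma_mean1(variance, rng):
    """Gamma draw with mean 1 and the given variance."""
    return rng.gammavariate(1.0 / variance, variance)


def simulate_blowflies(init, n_steps, P=3.28, N0=680.0, delta=0.161,
                       sigma_p=1.35, sigma_d=0.747, tau=14, dt=1, seed=0):
    """Simulate N(t) on a grid of spacing dt days.

    `init` must hold at least tau/dt + 1 past values, most recent last.
    """
    rng = random.Random(seed)
    lag = round(tau / dt)
    path = list(init)
    for _ in range(n_steps):
        n_now, n_lag = path[-1], path[-1 - lag]
        e = rgamma_mean1(sigma_p ** 2 / dt, rng)    # recruitment noise
        eps = rgamma_mean1(sigma_d ** 2 / dt, rng)  # survival noise
        recruits = rpois(P * n_lag * math.exp(-n_lag / N0) * dt * e, rng)
        survivors = rbinom(n_now, math.exp(-delta * dt * eps), rng)
        path.append(recruits + survivors)
    return path


# Arbitrary constant history of 500 adults over the preceding 14 days.
path = simulate_blowflies([500] * 15, n_steps=100, seed=1)
print(path[-10:])
```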

Supposing that $\Delta$ is chosen to divide $\tau$, the above
construction defines a discrete-time Markov process
$X(t) = (N(t), N(t-\Delta), N(t-2\Delta), \ldots, N(t-\tau))$.
The choices $\Delta = 1$ day and $\Delta = 2$ day can then be
viewed as Euler approximations to a continuous-time
Markov process that is defined by taking the limit
$\Delta \to 0$ (Breto et al., 2009). To complete a POMP
model, one needs to specify initial conditions and
a measurement process. Write Nicholson's recorded
data as $y_1, \ldots, y_T$ where $y_k$ gives the adult blowflies
counted at time $t_k = 2k$ day, and $T = 200$. For comparison
with Xia and Tong, I fixed $\tau = 14$ day and
required that the model should provide a likelihood
for $y_9, y_{10}, \ldots, y_T$. The initial state $X(t_8)$ can be constructed
using $y_1, \ldots, y_8$. With $\Delta = 2$ day, I chose to
set $N(t_k) = y_k$ for $k \in \{1, \ldots, 8\}$ rather than treating
the initial conditions as unknown parameters.
For general $\Delta$, I specified $X(t_8)$ using a cubic spline
interpolation of $y_1, \ldots, y_8$.

My measurement model was $y_k \sim \mathrm{Negbinom}(N(t_k), \sigma_y^{-2})$,
a negative binomial distribution conditional
on $N(t_k)$ with mean $N(t_k)$ and variance $N(t_k) + [\sigma_y N(t_k)]^2$.
Nicholson's adult blowfly counts certainly
contained some error due to an inconsistency between
the counts of dead adults and newly emerging
adults that were used to infer the counts of living
adults (Brillinger et al., 1980). However, the uncertainty
in the measurement model necessary to provide
a good statistical fit to the data has a more
subtle interpretation. The fertility of adults varies 
according to their age and potentially for other un- 
modeled biological reasons. In the scientific motiva- 
tion of the process model, the process model for N(t) 
may be interpreted as describing fertility (in units of 
ideal, standardized adults) rather than simply mea- 
suring the actual number of adults. The measure- 
ment error then includes fluctuations in the calibra- 
tion between the actual number of adults present 
and their reproductive potential. 
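
The mean and variance stated for the measurement model can be checked numerically. The Python sketch below writes the negative binomial log-density in its mean and size parameterization, taking the size to be the inverse square of the measurement uncertainty parameter. The function name and the test values (mean 20, uncertainty parameter 0.5) are illustrative assumptions, not values from the analysis.

```python
import math


def negbinom_logpmf(y, mean, sigma_y):
    """Log-density of a negative binomial count y with the given mean
    and size = sigma_y ** -2, so that the variance is
    mean + (sigma_y * mean) ** 2. Requires mean > 0."""
    size = sigma_y ** -2
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mean))
            + y * math.log(mean / (size + mean)))


# Numerical check of the first two moments for illustrative values.
mean, sigma_y = 20.0, 0.5
probs = [math.exp(negbinom_logpmf(y, mean, sigma_y)) for y in range(2001)]
total = sum(probs)
m1 = sum(y * p for y, p in enumerate(probs))
var = sum((y - m1) ** 2 * p for y, p in enumerate(probs))
print(total, m1, var)
```

With a small uncertainty parameter the quadratic term is negligible and the distribution is close to Poisson; larger values add overdispersion that grows with the square of the mean.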

Likelihood-based inference for POMP models using
iterated filtering has been described and discussed
elsewhere (Ionides, Breto and King, 2006; King
et al., 2008; Ionides, Breto and King, 2008; Breto
et al., 2009; Bhadra, 2010; He, Ionides and King,
2010; Laneri et al., 2010). This methodology involves 
employing sequential Monte Carlo techniques for eva- 
luation and optimization of the likelihood function. 
The dynamic process model enters the computations 
only through the generation of sample paths at vary- 
ing values of the parameters. Methodology enjoying 
this property has been called plug-and-play (Breto 
et al., 2009; He, Ionides and King, 2010) since it 
can be implemented simply by plugging simulation 
code for the process model into inference software. 
In particular, likelihood-based inference is possible 
even when the likelihood function itself can be eval- 
uated only by Monte Carlo methods. 
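
The plug-and-play property can be illustrated with a basic bootstrap particle filter, sketched below in Python. This is a minimal likelihood evaluator, not the iterated filtering algorithm and not the pomp implementation. The toy random-walk model, the particle count and the argument names `rprocess` and `dmeasure` are assumptions chosen for illustration. The point to notice is that the process model enters only through a function that simulates one step forward.

```python
import math
import random


def bootstrap_filter(ys, init_particles, rprocess, dmeasure, rng):
    """Monte Carlo estimate of the log-likelihood of observations ys.

    rprocess(x, rng) simulates the state one step ahead;
    dmeasure(y, x) evaluates the measurement density of y given x.
    """
    particles = list(init_particles)
    n = len(particles)
    loglik = 0.0
    for y in ys:
        # Propagate each particle using simulation only.
        particles = [rprocess(x, rng) for x in particles]
        weights = [dmeasure(y, x) for x in particles]
        total = sum(weights)
        if total == 0.0:
            return float("-inf")
        # The average weight estimates the conditional density of y.
        loglik += math.log(total / n)
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)
    return loglik


def toy_rprocess(x, rng):
    """Toy process model: Gaussian random walk with unit variance."""
    return x + rng.gauss(0.0, 1.0)


def toy_dmeasure(y, x):
    """Toy measurement model: y is x plus standard normal noise."""
    return math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2.0 * math.pi)


# Simulate a short toy data set, then estimate its log-likelihood.
data_rng = random.Random(42)
state, ys = 0.0, []
for _ in range(20):
    state = toy_rprocess(state, data_rng)
    ys.append(state + data_rng.gauss(0.0, 1.0))

loglik = bootstrap_filter(ys, [0.0] * 500, toy_rprocess, toy_dmeasure,
                          random.Random(7))
print(loglik)
```

Replacing `toy_rprocess` with a simulator for any other Markov process model leaves the filter unchanged, which is the sense in which such methods are plug-and-play.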

There was some indication that the alternative
model fits better for $\Delta = 1$ day (maximized log-likelihood
of -1465.4) than for $\Delta = 2$ day (maximized
log-likelihood of -1471.4). I did not investigate the
introduction of an exponent $\alpha$ that Xia and Tong
proposed to modify the effect of a large time discretization
step. One of the advantages of the POMP
framework is that it applies to continuous-time process
models, or models based on arbitrarily small
time discretizations, which makes such modifications
unnecessary (Breto et al., 2009). Here, there is little
reason to prefer the analysis with $\Delta = 2$ day to $\Delta = 1$ day.
The MLE for $\Delta = 1$ day was $P = 3.28$ day$^{-1}$,
$N_0 = 680$, $\delta = 0.161$ day$^{-1}$, $\sigma_p = 1.35$ day$^{1/2}$,
$\sigma_d = 0.747$ day$^{1/2}$ and $\sigma_y = 0.0266$. All parameters are
seen to be consistent with the biological interpretation
of the model. The measurement uncertainty
parameter, $\sigma_y$, is estimated to be small so most of
the stochasticity is assigned to variability in the dynamic
process.